Make MASTERDOWN a retriable error in RedisCluster client #3164

Open: wants to merge 1 commit into base: master

justinmir

When a cluster runs with `replica-serve-stale-data no`, a replica returns a MASTERDOWN error under two conditions:

  1. The primary has failed, so the replica is not serving requests.
  2. The replica has just started and has not yet synced from the primary.

The first condition, a failed primary, is similar to a CLUSTERDOWN error and should be retriable in the same way.

In the second condition, where a replica has just started and has not yet synced from the primary, the request should be retried on another available node in the shard. Otherwise, a percentage of the read requests to the shard will fail.

Examples with `replica-serve-stale-data no` set:

  1. In a cluster using `ReadOnly` with a single read replica, every read request sent to a replica in this state returns an error to the client, because MASTERDOWN is not a retriable error.
  2. In a cluster using `RouteRandomly`, a percentage of the requests return errors to the client, depending on whether the affected server was selected.
